Smart Paper Notes

Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques

JunKyu Lee, Lev Mukhanov, Amir Sabbagh Molahosseini, Umar Minhas, Yang Hua, Jesus Martinez del Rincon, Kiril Dichev, Cheol-Ho Hong, Hans Vandierendonck

Category: Machine Learning

2021-12-30

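Deep learning is pervasive in our daily lives, including self-driving cars, virtual assistants, social network services, healthcare services, and face recognition. However, deep neural networks demand substantial compute resources during training and inference. The machine learning community has mainly focused on model-level optimizations, such as architectural compression of deep learning models, while the systems community has focused on implementation-level optimization. In between, various arithmetic-level optimization techniques have been proposed in the arithmetic community. This article surveys resource-efficient deep learning techniques at the model, arithmetic, and implementation levels, and identifies the research gaps across these three levels. Based on our definition of a resource-efficiency metric, the survey clarifies the influence of lower-level techniques and discusses future trends in resource-efficient deep learning research.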

Translated by Google Translate

Bounding the Last Mile: Efficient Learned String Indexing

Benjamin Spector, Andreas Kipf, Kapil Vaidya, Chi Wang, Umar Farooq Minhas, Tim Kraska

Category: Machine Learning

2021-11-29

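We introduce the RadixStringSpline (RSS) learned index structure for efficiently indexing strings. RSS is a tree of radix splines, each indexing a fixed number of bytes. RSS approaches or exceeds the performance of traditional string indexes while using 7 to 70 times less memory. RSS achieves this by using the minimal string prefix that sufficiently distinguishes the data, unlike most learned approaches, which index the entire string. Additionally, the bounded-error nature of RSS accelerates the last-mile search and also enables a memory-efficient hash-table lookup accelerator. We benchmark RSS against ART and HOT on several real-world string datasets. Our experiments suggest that this line of research may be promising for future memory-intensive database applications.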
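The bounded-error idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the RSS structure itself: it assumes a single linear model over a fixed-length key prefix, and the names `prefix_to_int` and `BoundedErrorIndex` are hypothetical. The point it shows is only that a known maximum prediction error lets the final ("last-mile") search be restricted to a small window.

```python
import bisect


def prefix_to_int(key: str, nbytes: int = 8) -> int:
    """Interpret the first nbytes of the key (zero-padded) as a big-endian integer."""
    raw = key.encode("utf-8")[:nbytes].ljust(nbytes, b"\x00")
    return int.from_bytes(raw, "big")


class BoundedErrorIndex:
    """Toy learned index: linear model over a key prefix plus a recorded max error.

    Assumption: keys is non-empty. Duplicates are dropped at build time.
    """

    def __init__(self, keys, nbytes=8):
        self.keys = sorted(set(keys))
        self.nbytes = nbytes
        xs = [prefix_to_int(k, nbytes) for k in self.keys]
        self.lo, self.hi = xs[0], xs[-1]
        span = self.hi - self.lo
        self.scale = (len(self.keys) - 1) / span if span > 0 else 0.0
        # Largest distance between predicted and true position over all keys.
        self.max_err = max(abs(self._predict(x) - i) for i, x in enumerate(xs))

    def _predict(self, x: int) -> int:
        return int((x - self.lo) * self.scale)

    def lookup(self, key: str) -> int:
        """Return the position of key in self.keys, or -1 if it is absent."""
        x = min(max(prefix_to_int(key, self.nbytes), self.lo), self.hi)
        pos = self._predict(x)
        # Last-mile search: binary search only inside the error window.
        left = max(0, pos - self.max_err)
        right = min(len(self.keys), pos + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, left, right)
        return i if i < len(self.keys) and self.keys[i] == key else -1
```

A stored key is always found because its true position lies within `max_err` of the prediction by construction; a tighter model gives a smaller window and therefore a cheaper final search.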


SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Srijay Deshpande, Muhammad Dawood, Fayyaz Minhas, Nasir Rajpoot

Category: Computer Vision | Machine Learning

2022-12-28

Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangements of different types of cells. Synthetic images generated by SynCLay can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations, given the grade of differentiation and the cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance on the cellular composition prediction task.


RFID-Cloud Integration for Smart Management of Public Car Parking Spaces

Umar Yahya, Ndawula Noah, Asingwire Hanifah, Lubega Faham, Abdal Kasule, Hamisi Ramadhan Mubarak

Category: Artificial Intelligence | Robotics

2022-12-25

Effective management of shared public spaces, such as car parking, is a challenging aspect of urban transformation for many cities, especially in the developing world. By leveraging sensing technologies, cloud computing, and artificial intelligence, cities are increasingly being managed smartly. Smart cities not only bring convenience to city dwellers but also improve their quality of life, as advocated by the United Nations in the 2030 Sustainable Development Goal on Sustainable Cities and Communities. Through the integration of the Internet of Things and cloud computing, this paper presents a successful proof-of-concept implementation of a framework for managing public car parking spaces. Parking slots are reserved through a cloud-hosted application, while entry to and exit from a parking slot are enabled through Radio Frequency Identification (RFID) technology, which triggers a real-time update of the slot's availability in the cloud-hosted database. This framework could bring considerable convenience to city dwellers, since motorists only have to drive to a parking space when sure of a vacant slot, an important stride towards the realization of sustainable smart cities and communities.
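The reservation and RFID-triggered availability update described above can be sketched as a small state model. This is an illustrative assumption, not the paper's implementation: the class `ParkingLot`, its method names, and the in-memory dictionaries standing in for the cloud-hosted database are all hypothetical.

```python
class ParkingLot:
    """In-memory stand-in for the cloud-hosted slot database (hypothetical)."""

    def __init__(self, slot_ids):
        # slot id -> RFID tag holding the reservation (None means vacant)
        self.reserved_by = {slot: None for slot in slot_ids}
        # slot id -> True while a vehicle is physically parked there
        self.occupied = {slot: False for slot in slot_ids}

    def vacant_slots(self):
        return [slot for slot, tag in self.reserved_by.items() if tag is None]

    def reserve(self, tag):
        """Reserve the first vacant slot for this tag; return its id or None."""
        free = self.vacant_slots()
        if not free:
            return None
        self.reserved_by[free[0]] = tag
        return free[0]

    def _slot_of(self, tag):
        for slot, holder in self.reserved_by.items():
            if holder == tag:
                return slot
        return None

    def on_rfid_entry(self, tag):
        """Called when the entry reader scans a tag; admit only reserved tags."""
        slot = self._slot_of(tag)
        if slot is None:
            return False
        self.occupied[slot] = True
        return True

    def on_rfid_exit(self, tag):
        """Called when the exit reader scans a tag; frees the slot immediately."""
        slot = self._slot_of(tag)
        if slot is None or not self.occupied[slot]:
            return False
        self.occupied[slot] = False
        self.reserved_by[slot] = None
        return True
```

In this sketch a motorist would call `reserve` before driving to the car park, and the slot becomes visible as vacant again as soon as the exit scan is processed.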


IoT-Based Pothole Mapping Agent with Remote Visualization

Umar Yahya, Mwaka Lucky, Muhammed Mansoor, Nankabirwa Sharifah, Abdal Kasule, Kasagga Usama

Category: Robotics

2022-12-25

Driving on pothole-infested roads is hazardous to life and economically costly. The experience is even worse for motorists using a pothole-filled road for the first time. Pothole-filled road networks have been associated with severe traffic jams, especially during peak times of the day. Besides wasting fuel and time, traffic jams often lead to increased carbon emissions and noise pollution. Moreover, the risk of fatal accidents has been strongly associated with potholes, among other road network factors. Discovering potholes before using a particular road is therefore of significant importance. This work presents a successful demonstration of a sensor-based pothole mapping agent that captures both each pothole's depth and its location coordinates, parameters that are then used to generate a pothole map for the agent's entire journey. The map can thus be shared with all motorists intending to use the same route.


RANA: Relightable Articulated Neural Avatars

Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

Category: Computer Vision

2022-12-06

We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting. We only require a short video clip of the person to create the avatar and assume no knowledge about the lighting environment. We present a novel framework to model humans while disentangling their geometry, texture, and also lighting environment from monocular RGB videos. To simplify this otherwise ill-posed task we first estimate the coarse geometry and texture of the person via SMPL+D model fitting and then learn an articulated neural representation for photorealistic image generation. RANA first generates the normal and albedo maps of the person in any given target body pose and then uses spherical harmonics lighting to generate the shaded image in the target lighting environment. We also propose to pretrain RANA using synthetic images and demonstrate that it leads to better disentanglement between geometry and texture while also improving robustness to novel body poses. Finally, we also present a new photorealistic synthetic dataset, Relighting Humans, to quantitatively evaluate the performance of the proposed approach.
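The final shading step described above (normal and albedo maps combined with spherical harmonics lighting) can be sketched for a single pixel. This is a generic second-order spherical harmonics formulation with the standard real basis constants, not code from the paper; the use of nine coefficients, a single monochrome light, and the function names are assumptions.

```python
def sh_basis(normal):
    """Real spherical harmonics basis (bands 0 to 2) for a unit normal (x, y, z)."""
    x, y, z = normal
    return [
        0.282095,                        # l=0
        0.488603 * y,                    # l=1, m=-1
        0.488603 * z,                    # l=1, m=0
        0.488603 * x,                    # l=1, m=1
        1.092548 * x * y,                # l=2, m=-2
        1.092548 * y * z,                # l=2, m=-1
        0.315392 * (3.0 * z * z - 1.0),  # l=2, m=0
        1.092548 * x * z,                # l=2, m=1
        0.546274 * (x * x - y * y),      # l=2, m=2
    ]


def shade_pixel(albedo, normal, sh_coeffs):
    """Shaded RGB = albedo * shading, where shading is the SH lighting at the normal.

    albedo: (r, g, b) reflectance; sh_coeffs: nine lighting coefficients
    (assumed monochrome here; a per-channel set would work the same way).
    """
    shading = sum(b * c for b, c in zip(sh_basis(normal), sh_coeffs))
    shading = max(0.0, shading)  # clamp negative lighting to zero
    return tuple(a * shading for a in albedo)
```

Relighting in this formulation amounts to swapping `sh_coeffs` while keeping the predicted albedo and normal fixed, which is why disentangling geometry, texture, and lighting matters.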


PhysDiff: Physics-Guided Human Motion Diffusion Model

Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-05

Denoising diffusion models hold great promise for generating diverse and realistic human motions. However, existing motion diffusion models largely disregard the laws of physics in the diffusion process and often generate physically-implausible motions with pronounced artifacts such as floating, foot sliding, and ground penetration. This seriously impacts the quality of generated motions and limits their real-world application. To address this issue, we present a novel physics-guided motion diffusion model (PhysDiff), which incorporates physical constraints into the diffusion process. Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion. The projected motion is further used in the next diffusion step to guide the denoising diffusion process. Intuitively, the use of physics in our model iteratively pulls the motion toward a physically-plausible space. Experiments on large-scale human motion datasets show that our approach achieves state-of-the-art motion quality and improves physical plausibility drastically (>78% for all datasets).
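The control flow described above, where each denoised motion is projected to a plausible one and then fed into the next step, can be shown with a deliberately tiny toy. Everything here is a stand-in: the "motion" is a list of three foot heights, `toy_denoise` is a hypothetical denoiser, and the projection simply clamps heights to the ground plane, whereas the paper uses motion imitation in a physics simulator.

```python
# Hypothetical "clean" estimate the toy denoiser drifts toward.
# Its first height is below the ground plane (height 0), i.e. implausible.
CLEAN_ESTIMATE = [-0.2, 0.1, 0.3]


def toy_denoise(motion):
    """Stand-in for one denoising step: move halfway toward the clean estimate."""
    return [0.5 * h + 0.5 * t for h, t in zip(motion, CLEAN_ESTIMATE)]


def project_to_plausible(motion):
    """Stand-in for the physics-based projection: forbid ground penetration."""
    return [max(0.0, h) for h in motion]


def sample(noisy_motion, total_steps, use_projection=True):
    """Iteratively denoise; optionally project after every step.

    The projected motion, not the raw denoised one, is what the next step sees.
    """
    motion = list(noisy_motion)
    for _ in range(total_steps):
        motion = toy_denoise(motion)
        if use_projection:
            motion = project_to_plausible(motion)
    return motion
```

Comparing `sample(x, 4)` with `sample(x, 4, use_projection=False)` on the same input shows the intended effect in miniature: only the guided run is free of below-ground heights.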


Comparison of semi-supervised learning methods for High Content Screening quality control

Umar Masud, Ethan Cohen, Ihab Bendidi, Guillaume Bollot, Auguste Genovesio

Category: Computer Vision | Machine Learning

2022-08-09

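Progress in automated microscopy and quantitative image analysis has promoted high-content screening (HCS) as an efficient drug discovery and research tool. While HCS makes it possible to quantify complex cellular phenotypes from images at high throughput, the process can be obstructed by image aberrations such as out-of-focus blur, fluorophore saturation, debris, a high level of noise, unexpected autofluorescence, or empty images. Although this issue has received moderate attention in the literature, overlooking these artefacts can seriously hamper downstream image processing tasks and hinder the detection of subtle phenotypes. Using quality control in HCS is therefore a primary concern and a prerequisite. In this work, we evaluate deep learning options that do not require extensive image annotation, in order to provide a straightforward and easy-to-use semi-supervised learning solution to this issue. Specifically, we compare the efficacy of recent self-supervised and transfer learning approaches for providing the base encoder of a high-throughput artefact image detector. The results of this study suggest that transfer learning methods should be preferred for this task, as they not only performed best here but also have the advantage of not requiring sensitive hyperparameter settings or extensive additional training.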


Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion

Muhammad Umar Farooq, Darshan Adiga Haniya Narayana, Thomas Hain

Category: Natural Language Processing

2022-07-07

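Multilingual speech recognition has drawn significant attention as an effective way to compensate for data scarcity in low-resource languages. End-to-end (E2E) modelling is preferred over conventional hybrid systems, mainly because it has no lexicon requirement. However, hybrid DNN-HMMs still outperform E2E models in limited-data scenarios. Furthermore, the problem of manual lexicon creation has been alleviated by publicly available trained grapheme-to-phoneme (G2P) models and text-to-IPA transliteration for many languages. In this paper, a novel approach to fusing hybrid DNN-HMM acoustic models is proposed in a multilingual setup for low-resource languages. Posterior distributions from different monolingual acoustic models, computed against a target-language speech signal, are fused together. A separate regression neural network is trained for each source-target language pair to transform posteriors from the source acoustic model to the target language. These networks require very limited data compared with ASR training. Posterior fusion yields relative gains of 14.65% and 6.5% over the multilingual and monolingual baselines, respectively. Cross-lingual model fusion shows that comparable results can be achieved without using posteriors from the language-dependent ASR.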
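The map-then-fuse pipeline described above can be sketched for a single speech frame. Two simplifications are assumptions made for illustration: a fixed mapping matrix stands in for the per-pair regression neural network, and fusion is a weighted average, which the abstract does not specify. All function names are hypothetical.

```python
def normalize(values):
    """Rescale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def map_posterior(source_posterior, mapping):
    """Transform a source-language posterior into the target language's classes.

    mapping[i][j] is the weight from source class i to target class j
    (a linear stand-in for the regression network trained per language pair).
    """
    n_target = len(mapping[0])
    mapped = [
        sum(source_posterior[i] * mapping[i][j] for i in range(len(source_posterior)))
        for j in range(n_target)
    ]
    return normalize(mapped)


def fuse_posteriors(posteriors, weights=None):
    """Weighted average of posteriors that all live in the target class space."""
    if weights is None:
        weights = [1.0 / len(posteriors)] * len(posteriors)
    n_classes = len(posteriors[0])
    fused = [
        sum(w * p[j] for w, p in zip(weights, posteriors)) for j in range(n_classes)
    ]
    return normalize(fused)
```

For example, a three-class source posterior can be mapped into a two-class target space and averaged with the target model's own posterior; leaving the target model out of the list corresponds to the purely cross-lingual fusion the abstract mentions.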


Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition

Muhammad Umar Farooq, Thomas Hain

Category: Natural Language Processing

2022-07-07

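Multilingual automatic speech recognition (ASR) systems mostly benefit low-resource languages but suffer degraded performance across several languages relative to their monolingual counterparts. Few studies have focused on understanding how languages behave in multilingual speech recognition setups. In this paper, a novel data-driven approach is proposed to investigate cross-lingual acoustic-phonetic similarities. The technique measures the similarities between posterior distributions from various monolingual acoustic models against a target speech signal. Deep neural networks are trained as mapping networks to transform the distributions from different acoustic models into a directly comparable form. The analysis shows that language closeness cannot be truly estimated from the size of the overlapping phoneme set. Entropy analysis of the proposed mapping networks shows that a language with less overlap can be more amenable to cross-lingual transfer, and hence more beneficial in the multilingual setup. Finally, the proposed posterior transformation approach is used to fuse monolingual models for a target language. A relative improvement of about 8% over the monolingual counterpart is achieved.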
